(57) ABSTRACT

In a publish/subscribe data processing broker network hav- 
ing a plurality of broker data processing apparatuses each of 
which has an input for receiving published messages directly 
from a publisher application and/or receiving subscription 
data from a subscriber application, a first broker data pro- 
cessing apparatus has: a unit for receiving a data message 
published on a first topic by a first publisher application; and 
a unit for forwarding the received published data message to 
a subscriber application which has requested, by entering 
subscription data, to receive a message on the first topic; 
wherein the first broker data processing apparatus sends a 
declaration to at least one other broker data processing 
apparatus of said plurality of broker data processing appa- 
ratuses declaring that the first broker data processing appa- 
ratus is the only broker data processing apparatus that is 
directly communicating with a publisher application that is 
publishing on the first topic. 

7 Claims, 2 Drawing Sheets 

PUBLISH AND SUBSCRIBE DATA
PROCESSING APPARATUS, METHOD AND 
COMPUTER PROGRAM PRODUCT WITH 
DECLARATION OF A UNIQUE PUBLISHER 
BROKER 

FIELD OF THE INVENTION 

The present invention relates to the field of data process- 
ing and more specifically to data processing which distrib- 
utes messages from suppliers (called, hereinafter, 
"publishers") of data messages to consumers (called, here- 
inafter "subscribers") of such messages. 

BACKGROUND OF THE INVENTION 

Publish/subscribe data processing systems have become 
very popular in recent years as a way of distributing data 
messages from publishing computers to subscribing com- 
puters. The increasing popularity of the Internet, which has 
connected a wide variety of computers all over the world, 
has helped to make such publish/subscribe systems even 
more popular. Using the Internet, a World Wide Web 
browser application (the term "application" or "process" 
refers to a software program, or portion thereof, running on 
a computer) can be used in conjunction with the publisher or 
subscriber in order to graphically display messages. Such 
systems are especially useful where data supplied by a 
publisher is constantly changing and a large number of 
subscribers needs to be quickly updated with the latest data. 
Perhaps the best example of where this is useful is in the 
distribution of stock market data. 

In such systems, publisher applications of data messages 
do not need to know the identity or location of the subscriber 
applications which will receive the messages. The publish- 
ers need only connect to a publish/subscribe distribution 
agent process (the terms "distribution agent" and "broker" 
are used interchangeably herein), which is included in a 
group of such processes making up a broker network, and 
send messages to the distribution agent process, specifying 
the subject of the message to the distribution agent process. 
The distribution agent process then distributes the published 
messages to subscriber applications which have previously 
indicated to the broker network that they would like to 
receive data messages on particular subjects. Thus, the 
subscribers also do not need to know the identity or location 
of the publishers. The subscribers need only connect to a 
distribution agent process. 

One such publish/subscribe system which is currently in 
use, and which has been developed by the Transarc Corp. (a 
wholly owned subsidiary of the assignee of the present 
patent application, IBM Corp.) is shown in FIG. 1. Publish- 
ers 11 and 12 connect to the publish/subscribe broker 
network 2 and send published messages to broker network 
2 which distributes the messages to subscribers 31, 32, 33, 
34. Publishers 11 and 12, which are data processing appli- 
cations which output data messages, connect to broker 
network 2 using the well known interapplication data con- 
nection protocol known as remote procedure call (or RPC). 
Each publisher application could be running on a separate
machine; alternatively, a single machine could be running a
plurality of publisher applications. The broker network 2 is 
made up of a plurality of distribution agents (21 through 27) 
which are connected in a hierarchical fashion which will be 
described below as a "tree structure". These distribution 
agents, each of which could be running on a separate 
machine, are data processing applications which distribute 
data messages through the broker network 2 from publishers 
to subscribers. Subscriber applications 31, 32, 33 and 34
connect to the broker network 2 via RPC in order to receive 
published messages. 

Publishers 11 and 12 first connect via RPC directly to a 
root distribution agent 21 which in turn connects via RPC to
second level distribution agents 22 and 23 which in turn 
connect via RPC to third level distribution agents 24, 25, 26 
and 27 (also known as "leaf distribution agents" since they 
are the final distribution agents in the tree structure). Each 
distribution agent could be running on its own machine, or
alternatively, groups of distribution agents could be running 
on the same machine. The leaf distribution agents connect 
via RPC to subscriber applications 31 through 34, each of 
which could be running on its own machine. 

In order to allow the broker network 2 to determine which 
published messages should be sent to which subscribers, 
publishers provide the root distribution agent 21 with the 
name of a distribution stream for each published message. A 
distribution stream (called hereinafter a "stream") is an
ordered sequence of messages having a name (e.g., "stock"
for a stream of stock market quotes) to distinguish the stream
from other streams. Likewise, subscribers provide the leaf 
distribution agents 24 through 27 with the name of the
streams to which they would like to subscribe. In this way, 
the broker network 2 keeps track of which subscribers are 
interested in which streams so that when publishers publish
messages to such streams, the messages can be distributed to 
the corresponding subscribers. Subscribers are also allowed 
to provide filter expressions to the broker network in order 
to limit the messages which will be received on a particular 
stream (e.g., a subscriber 31 interested in only IBM stock
quotes could subscribe to the stream "stock" by making an 
RPC call to leaf distribution agent 24 and include a filter 
expression stating that only messages on the "stock" stream 
relating to IBM stock should be sent to subscriber 31). 
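
The stream subscription with a filter expression described above can be sketched roughly as follows. This is only an illustration: the `LeafAgent` class, the dictionary message shape and the use of a Python callable as the filter are assumptions made for the sketch, since the text does not say how filter expressions are written or evaluated.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Message = Dict[str, object]          # hypothetical message shape
Filter = Callable[[Message], bool]   # hypothetical form of a filter expression


@dataclass
class LeafAgent:
    """Toy leaf distribution agent holding per-stream subscriptions."""
    # stream name -> list of (subscriber id, filter)
    subs: Dict[str, List[Tuple[int, Filter]]] = field(default_factory=dict)

    def subscribe(self, subscriber: int, stream: str,
                  flt: Filter = lambda m: True) -> None:
        # Without a filter, every message on the stream is accepted.
        self.subs.setdefault(stream, []).append((subscriber, flt))

    def deliver(self, stream: str, msg: Message) -> List[int]:
        """Return the subscribers that should receive msg on this stream."""
        return [s for s, flt in self.subs.get(stream, []) if flt(msg)]


agent_24 = LeafAgent()
# Subscriber 31 only wants IBM quotes; subscriber 32 wants the whole stream.
agent_24.subscribe(31, "stock", lambda m: m["symbol"] == "IBM")
agent_24.subscribe(32, "stock")

ibm = agent_24.deliver("stock", {"symbol": "IBM", "price": 101.5})
other = agent_24.deliver("stock", {"symbol": "XYZ", "price": 7.0})
```

Here `ibm` lists both subscribers, while `other` lists only subscriber 32, because the filter supplied by subscriber 31 rejects the non-IBM quote.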

The above-described publish/subscribe architecture pro-
vides the advantage of central coordination of all published
messages, since all publishers must connect to the same 
broker (the root) in order to publish a message to the broker 
network. For example, total ordering of published messages 
throughout the broker network is greatly facilitated, since
the root can easily assign sequence numbers to each pub- 
lished message on a stream. However, this architecture also 
has the disadvantage of publisher inflexibility, since each 
publisher is constrained to publishing from the single root 
broker, even when it would be much easier for a publisher
to connect to a closer broker. 
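
The ordering advantage of a single root can be illustrated with a minimal sketch. The `RootAgent` class and the dictionary it returns are hypothetical; the text only states that the root can assign sequence numbers to each published message on a stream.

```python
from collections import defaultdict
from itertools import count


class RootAgent:
    """Toy root distribution agent that stamps per-stream sequence numbers."""

    def __init__(self):
        # One independent counter per stream name, each starting at 1.
        self._counters = defaultdict(lambda: count(1))

    def publish(self, stream, payload):
        # Every publisher goes through the root, so one counter per stream
        # is enough to order all messages on that stream.
        seq = next(self._counters[stream])
        return {"stream": stream, "seq": seq, "payload": payload}


root = RootAgent()
a = root.publish("stock", "IBM 101.5")   # e.g., sent by publisher 11
b = root.publish("stock", "IBM 101.7")   # e.g., sent by publisher 12
c = root.publish("news", "headline")     # a different stream, numbered separately
```

Because both publishers must pass through the same broker, `a` and `b` receive consecutive numbers on "stock" regardless of which publisher sent them.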

Accordingly, publish/subscribe software designers are
beginning to consider architectures where publishers are
allowed to publish messages directly to any broker in the
broker network. This clearly has the advantage of removing
the above-mentioned constraint on publishers. However, as
with any tradeoff, it presents other problems. One of the
major problems is that since a publisher can publish from
any broker, subscription data (data indicating which
subscribers have subscribed to which streams/topics) must be
propagated throughout the broker network, as it cannot be
determined where a publisher on a particular topic/stream
will publish from. Propagating subscription data
throughout the broker network is the only way (besides
sending all published messages to every broker) to guarantee
that published messages, from wherever they may be
published, will make their way to the subscribers who have
requested the messages. This requirement imposes a great
strain on the broker network: it not only creates a high
level of data traffic throughout the network but also
requires the subscription data to be stored and maintained
locally at each broker in the broker network.

SUMMARY OF THE INVENTION

According to one aspect, the present invention provides in
a publish/subscribe data processing broker network having a
plurality of broker data processing apparatuses each of
which has an input for receiving published messages directly
from a publisher application and/or receiving subscription
data from a subscriber application, a first broker data
processing apparatus comprising: means for receiving a data
message published on a first topic by a first publisher
application; and means for forwarding the received published
data message to a subscriber application which has
requested, by entering subscription data, to receive a
message on the first topic; wherein the first broker data
processing apparatus sends a declaration to at least one
other broker data processing apparatus of said plurality of
broker data processing apparatuses declaring that the first
broker data processing apparatus is the only broker data
processing apparatus that is directly communicating with a
publisher application that is publishing on the first topic.

According to a second aspect, the present invention
provides a data processing method having method steps
corresponding to each element of the data processing
apparatus of the first aspect of the invention.

According to a third aspect, the present invention provides
a computer readable storage medium having a computer
program stored on it which, when executed on a computer,
carries out the functionality of data processing method of
the second aspect of the invention.

The present invention allows one broker in a network of
such brokers, to be declared as the unique source of taking
published messages into the network for a particular topic.

Thus, with the present invention, since a publisher
application can be declared as the unique source of
publications on a stated topic in the network, the problem
that existed in the prior art of requiring subscription data
to be propagated, maintained and stored by each distribution
agent throughout the broker hierarchy no longer exists.
Specifically, the problem no longer exists because there is
no more uncertainty regarding where a publisher application
might publish from. Thus, subscription data need only be
propagated to and maintained on distribution agents which
are included in a direct path between the unique broker
source on the stated topic and a subscriber which has
subscribed to that topic.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be better understood by referring to the
detailed description of the preferred embodiments which
will now be described in conjunction with the following
drawing figures:

FIG. 1 shows the architecture of a prior art publish/
subscribe broker network which was referred to above; and

FIG. 2 shows the architecture of a publish/subscribe
broker network according to which the preferred embodiment
of the present invention will be explained below.

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENTS

In the prior art FIG. 1 discussed above, a publisher
application 11, running on one computer, is, for example, a
supplier of live stock market data quotes. That is, publisher
application 11 provides frequent messages stating the
present value of share prices. In this example, publisher
application 11 is publishing messages on a stream called
"stock" which has already been configured in the broker
network 2. As is well known, when publisher 11 wishes to
publish a stock quote message to stream "stock", publisher
11 makes an RPC call to the root distribution agent 21 which
is at the top level of the broker network tree structure. In this
example, subscriber application 32, running on another
computer, has sent a subscription request via an RPC call to
leaf distribution agent 24, which is at the bottom level of the
tree structure, indicating that subscriber 32 would like to
subscribe to stream "stock".

Thus, whenever publisher 11 publishes a data message to
stream "stock" the distribution tree structure of broker
network 2 channels the message down through the root
distribution agent 21, through any intermediary distribution
agents (e.g., 22 in the example of FIG. 1) and through the
leaf distribution agent 24 to the subscriber 32. This involves
a series of RPC calls being made between each successive
circle in the diagram of FIG. 1 connecting publisher 11 and
subscriber 32 (i.e., 11 to 21, 21 to 22, 22 to 24 and 24 to 32).

FIG. 2 shows a different publish/subscribe architecture
where publisher applications can publish messages to the
broker network by directly communicating with any one of
a plurality of distribution agents (brokers). For example,
publisher application 201 is shown communicating directly
with Broker 12. There is no requirement in this architecture
that all publisher applications communicate directly with a
top (or root) distribution agent. Publisher application 201
can potentially communicate directly with any of the
distribution agents shown in FIG. 2; in the described examples
below it will be shown communicating directly with Broker
12.

Subscriber applications 202 and 203 would like to receive
messages on the stream/topic that publisher application 201
is publishing on. Thus, subscriber applications 202 and 203
communicate directly with Brokers 1112 and 1221,
respectively, to provide subscription data thereto informing
the broker hierarchy of their desire to receive such published
messages. Since the publisher application 201 is allowed to
communicate directly with any of a plurality of distribution
agents, the subscription data entered by the subscriber
applications must be propagated throughout the broker
network to each Broker shown in FIG. 2. This way, no matter
which distribution agent the publisher application 201
happens to communicate directly with, the published messages
will be able to be routed to the subscriber applications 202
and 203. As stated above, however, this creates a high
performance overhead due to the excessive amounts of
subscription data propagation traffic throughout the broker
network and due to the need to have to maintain and store
such subscription data locally at each distribution agent.

If a distribution agent (also referred to herein as a
"broker") can be identified to the other distribution agents as
the home to all publisher applications (e.g. by topic content
or a publisher flag) on a given topic, call this a unique
publisher broker for simplicity, it is possible to restrict the
subscription path in the hierarchy by halting the propagation
of the subscription data once this unique publisher broker is
reached.

To further limit subscription propagation in the
unique publisher broker case it is possible to remove
subscriptions that have been propagated down branches of the
hierarchy leading off the path between the subscriber and the
publisher that contain no subscriptions or the publisher on
this topic, thus, reducing the subscriptions for a topic to only
lie on the path(s) between the subscriber's (or subscribers')
broker(s) and the publisher's broker.

The first level of subscription data propagation restriction
prevents subscription data from flowing further once the
unique publisher broker is reached by the subscription data.
When a subscription for a topic arrives at a unique publisher 
broker and the topic matches the topic on which this broker 
is the unique publisher broker, the unique publisher broker 
will not propagate the subscription any further through the
hierarchy as it is known that no other broker can possibly 
publish on this topic. For example, if a new subscriber 
application 203 attaches to its nearest Broker 1221 and 
enters a subscription to a certain topic (e.g., IBM stock 
price), this subscription data identifying the new subscrip-
tion will propagate up to Broker 122 and then further up to
Broker 12 (which has previously declared itself to the other 
brokers as the unique publisher broker on the topic of IBM 
stock price). Broker 12 will then recognize that the sub-
scription data's topic (IBM stock price) matches the topic
(IBM stock price) on which Broker 12 is the unique pub- 
lisher broker, and thus Broker 12 will not further propagate 
the subscription data to Broker 121 or Broker 1. 
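
The first-level restriction can be sketched as a small propagation routine. The neighbour map below is an assumption inferred from the broker names mentioned in the text (FIG. 2 itself is not reproduced here), and `propagate` and the `declared` dictionary are hypothetical names introduced only for this illustration.

```python
# Assumed tree of brokers, inferred from the names used in the text.
NEIGHBOURS = {
    "1": ["11", "12"],
    "11": ["1", "111", "112"],
    "12": ["1", "121", "122"],
    "111": ["11", "1111", "1112"],
    "112": ["11", "1121"],
    "121": ["12"],
    "122": ["12", "1221"],
    "1111": ["111"],
    "1112": ["111"],
    "1121": ["112"],
    "1221": ["122"],
}


def propagate(start, topic, unique_publisher):
    """Return {broker: source} for every broker that stores the subscription.

    unique_publisher maps a topic to the broker that has declared itself
    the unique publisher broker for it (a hypothetical representation).
    """
    stored = {start: "local subscriber"}
    frontier = [(start, None)]            # (broker, neighbour it came from)
    while frontier:
        broker, came_from = frontier.pop()
        if unique_publisher.get(topic) == broker:
            continue                      # first-level restriction: stop here
        for n in NEIGHBOURS[broker]:
            if n != came_from:
                stored[n] = broker        # n stores a subscription from broker
                frontier.append((n, broker))
    return stored


declared = {"IBM stock price": "12"}
stored = propagate("1221", "IBM stock price", declared)
```

Under these assumptions only Brokers 1221, 122 and 12 end up holding the subscription; with an empty `declared` mapping the same call would reach all eleven brokers in the assumed tree.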

The second level of limiting subscription data propagation 
is the removal of unnecessary subscriptions which have
already been propagated to brokers, i.e., those subscriptions
that do not lie on the path(s) between subscriber(s) and the
unique publisher broker, once a new unique publisher broker 
is added to an existing broker hierarchy. Any unnecessary 
subscriptions can be identified by the fact that they would
cause publications to flow in the opposite direction from 
those originating from the unique publisher broker, which is 
not possible for they would have to have originated from a 
publisher on another broker, and thus, the publisher broker 
could not be unique.
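
One way to picture this identification test is to compare, for each broker, the neighbour a subscription arrived from with the neighbour that leads toward the unique publisher broker: if they coincide, honouring the subscription would send publications back toward the publisher. The sketch below assumes the same inferred broker tree as the text's examples, a subscriber attached at Broker 1112 and Broker 12 as the unique publisher broker; `toward` is a hypothetical helper and this is not presented as the patent's own procedure.

```python
from collections import deque

# Assumed tree of brokers, inferred from the names used in the text.
NEIGHBOURS = {
    "1": ["11", "12"],
    "11": ["1", "111", "112"],
    "12": ["1", "121", "122"],
    "111": ["11", "1111", "1112"],
    "112": ["11", "1121"],
    "121": ["12"],
    "122": ["12", "1221"],
    "1111": ["111"],
    "1112": ["111"],
    "1121": ["112"],
    "1221": ["122"],
}


def toward(target):
    """Map each broker to its neighbour on the tree path to target."""
    hop = {target: None}
    queue = deque([target])
    while queue:
        broker = queue.popleft()
        for n in NEIGHBOURS[broker]:
            if n not in hop:
                hop[n] = broker
                queue.append(n)
    return hop


# After flooding, each broker holds the subscription from the neighbour
# that leads toward the subscriber's broker (1112).
came_from = toward("1112")
# Each broker's neighbour in the direction of the unique publisher broker.
upstream = toward("12")

# A subscription is unnecessary when it points toward the publisher.
unnecessary = sorted(
    b for b in NEIGHBOURS
    if came_from[b] is not None and came_from[b] == upstream[b]
)
```

With these assumptions the unnecessary subscriptions are those held at Brokers 1111, 112, 1121, 121, 122 and 1221, leaving only the path 1112, 111, 11, 1, 12.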

The preferred embodiment involves the use of a special
message (for example, a publication), called a unique
publisher broker message, which contains the topic concerned
and the identity of the broker that has just sent this message.
A broker receiving a unique publisher message will follow
these rules:

1) If this broker also claims to be a unique publisher broker
on this same topic, then more than one broker in the
hierarchy believes it is the unique publisher on the same
topic; this cannot be valid, and an error is reported.
Otherwise, the broker marks the topic that matches the one
in the message as being a unique publisher topic.

2) If the broker has a subscription from the broker that sent 
this message, the subscription can be removed. This is
because the subscription could only be used if a publica- 
tion arrived at this broker and was to be propagated 
towards the broker sending the unique publisher message. 
This would cause publications to flow towards the pub-
lisher which is not possible when the publisher is unique.
The identity of the broker sending this message is 
replaced with the identity of the current broker and the 
message is then propagated to every relation known to 
this broker, except the one that originated the unique 
publisher message.
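
The two rules above might be handled along the following lines. This is a sketch under stated assumptions: the `Broker` class, its attribute names and the `link` helper are hypothetical, and the small five-broker arrangement is set up by hand purely to exercise the handler.

```python
class Broker:
    """Toy broker keeping one set of subscription sources per topic."""

    def __init__(self, name, claims=()):
        self.name = name
        self.relations = []           # directly connected brokers
        self.subs = {}                # topic -> set of source names
        self.claims = set(claims)     # topics this broker claims as unique publisher
        self.unique_topics = set()    # topics marked as unique publisher topics

    def on_unique_publisher_message(self, topic, sender):
        # Rule 1: a second claimant on the same topic is an error.
        if topic in self.claims:
            raise RuntimeError(f"broker {self.name} also claims {topic!r}")
        self.unique_topics.add(topic)
        # Rule 2: a subscription received from the sender is redundant.
        self.subs.get(topic, set()).discard(sender.name)
        # Forward under this broker's identity to every other relation.
        for other in self.relations:
            if other is not sender:
                other.on_unique_publisher_message(topic, self)


def link(a, b):
    a.relations.append(b)
    b.relations.append(a)


TOPIC = "IBM stock price"
b12, b1, b11, b111, b112 = (Broker(n) for n in ("12", "1", "11", "111", "112"))
b12.claims.add(TOPIC)
for a, b in ((b12, b1), (b1, b11), (b11, b111), (b11, b112)):
    link(a, b)

# State after a subscription entered at broker 111 has reached every broker.
b111.subs[TOPIC] = {"local"}
b11.subs[TOPIC] = {"111"}
b112.subs[TOPIC] = {"11"}
b1.subs[TOPIC] = {"11"}
b12.subs[TOPIC] = {"1"}

# Broker 12 sends its unique publisher message to broker 1.
b1.on_unique_publisher_message(TOPIC, sender=b12)
```

After the call, broker 112 no longer holds a subscription (it had received it from broker 11, the sender of the message it saw), while brokers 1, 11 and 111 keep theirs because they lie on the path between the subscriber and broker 12.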

Now we define the three scenarios that can cause a unique
publisher message to be generated by a unique publisher
broker and how they are handled:

1) Subscriber applications subscribe to a topic by commu-
nicating directly (e.g., via RPC) with one of the brokers,
and the subscriptions (i.e., subscription data) are propa-
gated to all brokers before a unique publisher has been
identified. When a broker (e.g., Broker 12) declares that
it is the unique publisher broker on this topic and
subscription(s) already exist, the unique publisher broker
(e.g., Broker 12) marks the topic as being unique and a
unique publisher message is generated and sent to all
relations (meaning, all brokers that are direct neighbours)
of this broker (e.g., Brokers 121, 122 and 1). By following
the above rules this message will be propagated to all
brokers and any redundant subscriptions will be removed
from the hierarchy.

2) Before any subscriptions are made, a publisher broker 
(e.g., broker 12) believes that it is a unique publisher 
broker on a certain topic (e.g., IBM stock price). A
subscription to this topic then arrives at broker 12 from 
another broker (e.g., broker 1), once a subscriber appli- 
cation 202 has entered a subscription (e.g., by directly 
communicating the subscription data to broker 1112, 
which has resulted in corresponding subscription data 
propagating to brokers 111, 1111, 11, 112, 1121, 1 and 
finally to broker 12). At this point (when the subscription 
data reaches broker 12) we halt propagation of the sub- 
scription past broker 12, and broker 12 generates a unique 
publisher message and sends it to the broker 1 that sent the 
subscription data to broker 12. Again, by following the 
above rules this unique publisher message will be propa- 
gated from broker 1 to all brokers (i.e., 11, 112, 1121, 111, 
1112 and 1111) that have received the original subscrip- 
tion data. Then, the subscription data is removed from 
those brokers (i.e., 112, 1121, 1111) lying off the direct 
path between the unique publisher broker 12 and the 
subscriber application 202. 

3) A unique publisher broker 12 exists along with subscriber 
202 and a direct path (i.e., from subscriber 202 to broker
1112 to broker 111 to broker 11 to broker 1 to broker 12) 
between them has been formed. Then, a new subscription 
(from a new subscriber 204, shown in dotted line) is made
from a broker 1121 that lies in a branch off a direct path 
from the unique publisher broker 12 to an existing sub- 
scriber 202. When the new subscription data arrives at 
broker 11 (which is on the direct path mentioned above)
and the topic of the subscription has been marked as a 
unique publisher topic and a subscription to this topic 
already exists it is now known that we have intercepted a 
direct path between a publisher and a subscriber. The 
propagation of the subscription is halted at broker 11 (i.e.,
the subscription data has already propagated from broker 
1121 to broker 112 to broker 11), as a subscription to this 
topic would already have been propagated from broker 11 
to the unique publisher broker 12 due to the existing 
subscription. A unique publisher message is then gener- 
ated by broker 11 and sent back to the broker 112 that sent 
the new subscription. This is the same as the scenario 
above, only for a sub-tree of the broker hierarchy. 
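
The three scenarios can be walked through with a small single-topic simulation. Everything here is a sketch under assumptions: the neighbour map is inferred from the broker names in the text, the `Network` class and its method names are hypothetical, messages are assumed to be handled in first-in, first-out order, and the error check of rule 1 is omitted for brevity.

```python
from collections import deque

# Assumed tree of brokers, inferred from the names used in the text.
NEIGHBOURS = {
    "1": ["11", "12"], "11": ["1", "111", "112"], "12": ["1", "121", "122"],
    "111": ["11", "1111", "1112"], "112": ["11", "1121"], "121": ["12"],
    "122": ["12", "1221"], "1111": ["111"], "1112": ["111"],
    "1121": ["112"], "1221": ["122"],
}


class Network:
    """Single-topic toy model; queued messages are handled in FIFO order."""

    def __init__(self, unique=None):
        self.subs = {b: set() for b in NEIGHBOURS}    # broker -> sources
        self.unique = unique                          # unique publisher broker
        self.marked = {unique} if unique else set()   # brokers that know of it
        self.queue = deque()

    def subscribe(self, broker):
        self.queue.append(("sub", broker, "local"))
        self._run()

    def declare(self, broker):
        # Scenario 1: declaration after subscriptions have been flooded.
        self.unique = broker
        self.marked.add(broker)
        for n in NEIGHBOURS[broker]:
            self.queue.append(("uniq", n, broker))
        self._run()

    def _run(self):
        while self.queue:
            kind, broker, sender = self.queue.popleft()
            handler = self._on_sub if kind == "sub" else self._on_uniq
            handler(broker, sender)

    def _on_sub(self, broker, source):
        # Scenario 3: a marked broker that already holds a subscription
        # lies on an existing direct path to the unique publisher broker.
        on_path = broker in self.marked and bool(self.subs[broker])
        self.subs[broker].add(source)
        if broker == self.unique or on_path:
            if source != "local":
                self.queue.append(("uniq", source, broker))
            return                                    # halt propagation
        for n in NEIGHBOURS[broker]:
            if n != source:
                self.queue.append(("sub", n, broker))

    def _on_uniq(self, broker, sender):
        self.marked.add(broker)
        self.subs[broker].discard(sender)             # rule 2
        for n in NEIGHBOURS[broker]:
            if n != sender:
                self.queue.append(("uniq", n, broker))

    def holders(self):
        return {b for b, sources in self.subs.items() if sources}


# Scenario 1: subscribers 202 and 203 subscribe, then broker 12 declares.
s1 = Network()
s1.subscribe("1112")
s1.subscribe("1221")
s1.declare("12")
scenario_1 = s1.holders()

# Scenario 2: broker 12 is already unique when subscriber 202 subscribes.
s2 = Network(unique="12")
s2.subscribe("1112")
scenario_2 = s2.holders()

# Scenario 3: subscriber 204 then subscribes at broker 1121.
s2.subscribe("1121")
scenario_3 = s2.holders()
```

In this model, scenario 2 leaves subscriptions only on Brokers 1112, 111, 11, 1 and 12, with the entries at 112, 1121 and 1111 removed as the text describes; scenario 3 adds only Brokers 112 and 1121, since propagation halts at broker 11.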

While the preferred embodiment of the invention has been
discussed in the context of a broker network made up of a
hierarchy (e.g., designed from the top down) of distribution 
agents, the broker network need not be hierarchical. For 
example, the network could also be configured as a totally 
connected network, with each broker connected to every 
other broker (or some other combination of brokers less than 
every other broker). 
We claim: 

1. In a publish/subscribe data processing broker network 
having a plurality of broker data processing apparatuses 
each of which has an input for receiving published messages 
directly from a publisher application and/or receiving sub- 
scription data from a subscriber application, a first broker 
data processing apparatus comprising: 

means for receiving a data message published on a first 
topic by a first publisher application; and 

means for forwarding the received published data mes- 
sage to a subscriber application which has requested, 
by entering subscription data, to receive a message on
the first topic; 

wherein the first broker data processing apparatus sends a 
declaration to at least one other broker data processing 
apparatus of said plurality of broker data processing
apparatuses declaring that the first broker data process- 
ing apparatus is the only broker data processing appa- 
ratus that is directly communicating with a publisher 
application that is publishing on the first topic. 

2. The apparatus of claim 1 wherein a second broker data
processing apparatus, which is on a direct path between the 
first broker data processing apparatus and a subscriber 
application, sends the declaration on behalf of the first 
broker data processing apparatus upon receiving new sub-
scription data from a new subscriber application to the first
topic. 

3. The apparatus of claim 1 wherein upon receipt of the 
declaration subscription data is removed from broker data 
processing apparatuses that do not lie on a direct path 
between the first broker data processing apparatus and the
subscriber application. 

4. The apparatus of claim 1 wherein the network is the 
Internet. 

5. The apparatus of claim 1 wherein at least one of the 
publisher application and the subscriber application runs in
cooperation with a World Wide Web browser application. 

6. In a publish/subscribe data processing broker network 
having a plurality of broker processing apparatuses each of 
which has an input for receiving published messages directly 
from a publisher application and/or receiving subscription
data from a subscriber application, a method carried out by
a first broker data processing apparatus, the method com-
prising steps of:

receiving a data message published on a first topic by a
first publisher application; and

forwarding the received published data message to a
subscriber application which has requested, by entering 
subscription data, to receive a message on the first 
topic; 

wherein the first broker data processing apparatus sends a 
declaration to at least one other broker data processing 
apparatus of said plurality of broker data processing 
apparatuses declaring that the first broker data process- 
ing apparatus is the only broker data processing appa- 
ratus that is directly communicating with a publisher 
application that is publishing on the first topic. 

7. In a publish/subscribe data processing broker network
having a plurality of broker data processing apparatuses 
each of which has an input for receiving published messages 
directly from a publisher application and/or receiving sub- 
scription data from a subscriber application, a computer 
program product embodied on a computer readable storage 
medium for, when run on a computer, carrying out a method 
on a first broker data processing apparatus, the method 
comprising steps of:

receiving a data message published on a first topic by a
first publisher application; and

forwarding the received published data message to a
subscriber application which has requested, by entering 
subscription data, to receive a message on the first 
topic; 

wherein the first broker data processing apparatus sends a 
declaration to at least one other broker data processing 
apparatus of said plurality of distribution agent data 
processing apparatuses declaring that the first broker 
data processing apparatus is the only broker data pro-
cessing apparatus that is directly communicating with a
publisher application that is publishing on the first 
topic. 